DCU-TCD@LogCLEF 2010: Re-ranking Document Collections and Query Performance Estimation

Authors

  • Johannes Leveling
  • M. Rami Ghorab
  • Walid Magdy
  • Gareth J. F. Jones
  • Vincent P. Wade
Abstract

This paper describes the collaborative participation of Dublin City University and Trinity College Dublin in LogCLEF 2010. Two sets of experiments were conducted. First, different aspects of the TEL query logs were analysed after extracting user sessions of consecutive queries on a topic. Within each session, the relation between a query’s length (number of terms) and position (first query or later reformulation) and query performance estimators such as query scope, IDF-based measures, simplified query clarity score, and average inverse document collection frequency was examined. Results of this analysis suggest that only some estimator values show a correlation with query length or position in the TEL logs (e.g. similarity score between collection and query). Second, the relation between three attributes was investigated: the user’s country (detected from IP address), the query language, and the interface language. The investigation aimed to explore the influence of the three attributes on the user’s collection selection. Moreover, the investigation involved assigning different weights to the three attributes in a scoring function that was used to re-rank the collections displayed to the user according to the language and country. The results of the collection re-ranking show a significant improvement in Mean Average Precision (MAP) over the original collection ranking of TEL. The results also indicate that the query language and interface language have more influence than the user’s country on the collections selected by the users.
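
The abstract does not give the scoring function itself, so the following is only a minimal sketch of how a weighted re-ranking of collections and its evaluation by average precision might look. The `Collection` fields, the matching rule (exact match on language and country codes), the default weights, and the choice of "collections the user selected" as the relevant set are all assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collection:
    # Hypothetical record: the paper's actual collection metadata is not given.
    name: str
    language: str  # language code of the collection's content
    country: str   # country code of the providing library


def score(c, user_country, query_lang, interface_lang,
          w_country=1.0, w_query=2.0, w_interface=2.0):
    """Weighted sum of attribute matches (weights are illustrative only)."""
    s = 0.0
    if c.country == user_country:
        s += w_country
    if c.language == query_lang:
        s += w_query
    if c.language == interface_lang:
        s += w_interface
    return s


def rerank(collections, user_country, query_lang, interface_lang, **weights):
    """Sort by descending score; sorted() is stable, so ties keep TEL's order."""
    return sorted(
        collections,
        key=lambda c: -score(c, user_country, query_lang, interface_lang, **weights),
    )


def average_precision(ranking, relevant_names):
    """Average precision of one ranking; MAP is the mean of this over all queries."""
    hits, total = 0, 0.0
    for rank, c in enumerate(ranking, start=1):
        if c.name in relevant_names:
            hits += 1
            total += hits / rank
    return total / len(relevant_names) if relevant_names else 0.0


# Toy example: a user in DE, querying in French, with a French interface.
original = [
    Collection("A", "en", "GB"),
    Collection("B", "fr", "FR"),
    Collection("C", "de", "DE"),
]
reranked = rerank(original, user_country="DE", query_lang="fr", interface_lang="fr")
selected = {"B"}  # assumed: the collection the user actually clicked
print(average_precision(original, selected))  # 0.5
print(average_precision(reranked, selected))  # 1.0
```

Giving the two language attributes larger default weights than the country merely mirrors the abstract's finding that they have more influence; the actual weight values used in the experiments are not stated here.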

Similar resources

Learning to Rank using Query-Level Rules

Most existing learning to rank methods neglect query-sensitive information while producing functions to estimate the relevance of documents (i.e., all examples in the training data are treated indistinctly, no matter the query associated with them). This is counter-intuitive, since the relevance of a document depends on the query context (i.e., the same document may have different relevances, d...

DCU at CLEF 2006: Robust Cross Language Track

The main focus of the DCU group’s participation in the CLEF 2006 Robust Track was not to identify and handle difficult topics in the topic set per se, but rather to explore a new method of re-ranking a retrieved document set. The initial query is used to re-rank documents retrieved using a query expansion method. The intention is to ensure that the query drift that might occur as a...

TCD-DCU at LogCLEF 2009: An Analysis of Queries, Actions, and Interface Languages

This paper describes the collaborative participation of Trinity College Dublin and Dublin City University in the Log Analysis for Digital Societies (LADS) task of the LogCLEF 2009 track. An analysis of multilingual search logs was carried out with the objectives of investigating how users from different linguistic or cultural backgrounds behave in search, and how the discovery of patterns in user a...

Dublin City University at CLEF 2006: Robust Cross Language Track

The main focus of the DCU group’s participation in the CLEF 2006 Robust Track was to explore a new method of re-ranking a retrieved document set based on the initial query with a pseudo relevance feedback (PRF) query expansion method. The aim of re-ranking using the initial query is to force the retrieved assumed relevant set to mimic the initial query more closely while not removing the b...

LogCLEF 2011 Multilingual Log File Analysis: Language Identification, Query Classification, and Success of a Query

Since 2009, LogCLEF has been the initiative within the Cross-Language Evaluation Forum which aims at stimulating research on user behavior in multilingual environments and promoting standard evaluation collections of log data. During these editions of LogCLEF, different collections of log data were distributed to the participants together with manually annotated query records to be used as a tra...

Publication date: 2010